Stacked Approximated Regression Machine: A Simple Deep Learning Approach
Abstract
This paper proposes the Stacked Approximated Regression Machine (SARM), a novel, simple, yet powerful deep learning (DL) baseline. We start by discussing the relationship between regularized regression models and feed-forward networks, with emphasis on the non-negative sparse coding and convolutional sparse coding models. We demonstrate how these models are naturally converted into a unified feed-forward network structure that coincides with popular DL components. SARM is constructed by stacking multiple unfolded and truncated regression models. Compared to PCANet, whose feature extraction layers are completely linear, SARM naturally introduces non-linearities by embedding sparsity regularization. The parameters of SARM are easily obtained by solving a series of light-weight problems, e.g., PCA or KSVD. Extensive experiments show that SARM outperforms the existing simple deep baseline, PCANet, and is on par with many state-of-the-art deep models, but with much lower computational loads.

1 Motivation and Overview

This work was directly motivated by PCANet [5], which was designed to be a simple but competitive deep learning (DL) baseline. Classical deep neural networks (DNNs) [18] consist of stacked trainable stages followed by a supervised loss. Each stage comprises a fully-connected or convolutional layer together with a nonlinear neuron function, optionally followed by a feature pooling layer. Despite remarkable performance breakthroughs [24, 11], DNNs usually rely on time-consuming training by back propagation (BP), careful hyper-parameter tuning, and other empirical tricks [26]. In contrast, a PCANet comprises only very basic data processing components: a cascaded principal component analysis (PCA) is employed to learn multi-stage filter banks, and no nonlinear operation is involved until the very last layer, where binary hashing and histograms are used to compute the output features. This architecture can therefore be learned very efficiently. The authors of [5] discovered the surprising fact that a naive PCANet can compete with state-of-the-art DNNs in a wide range of recognition and classification tasks. PCANet thus serves as a good baseline for empirically justifying the use of more sophisticated processing components in deep networks.

Our goal is to design a novel DL baseline that demonstrates more competitive performance on discriminative tasks while remaining as easily trainable as PCANet. We construct a feed-forward feature learning hierarchy by stacking truncated and regularized regression-type models, called the Stacked Approximated Regression Machine (SARM). The architectures of many existing deep models [18, 24, 11] can be shown to be special cases of SARM. Solving for the parameters of SARM requires no BP; instead, they can be obtained by solving a series of light-weight problems, such as PCA or KSVD [7]. Compared to PCANet, whose feature extraction layers (stacked PCA filters) are completely linear, SARM naturally introduces non-linearities by enforcing sparsity regularization as a prior, which has been found to benefit discriminative tasks [6]. As shown in our study, SARM gains consistent performance advantages over PCANet. Further, SARM is on par with, and sometimes even outperforms, many state-of-the-art DNNs, with much lower computational loads.
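To make the construction concrete, the following is a minimal NumPy sketch (illustrative only, not the authors' code; the function name sarm_layer and its interface are ours, and the dictionary D is assumed to be given, e.g., learned by PCA or KSVD as described above) of one layer obtained by unfolding and truncating projected ISTA for the non-negative sparse coding problem min_z 0.5*||x - Dz||^2 + lam*||z||_1 with z >= 0:

```python
import numpy as np

def sarm_layer(x, D, lam, n_iter=1):
    """Truncated non-negative sparse coding as a feed-forward layer.

    Approximates argmin_z 0.5*||x - D@z||^2 + lam*||z||_1, z >= 0,
    by n_iter projected-ISTA steps starting from z = 0. With n_iter = 1
    this is exactly z = relu((D.T @ x - lam) / L), i.e., a ReLU layer.
    """
    L = np.linalg.norm(D, 2) ** 2                    # Lipschitz constant of the gradient
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ z - x)                     # gradient of the quadratic data term
        z = np.maximum(z - (grad + lam) / L, 0.0)    # shrink, then project onto z >= 0
    return z
```

With a single truncated iteration the layer collapses to a standard linear-plus-ReLU stage; stacking several such truncated solvers, each with its own dictionary, yields the feed-forward hierarchy described above, with no back propagation involved at any point.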
SARM thus proves to be a competitive baseline for many DL tasks. Its derivation also provides instructive insight into the working mechanism of deep networks.
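For contrast, the purely linear stage that SARM improves upon can be summarized in a few lines. The sketch below is a simplified illustration of one PCANet-style filter-learning stage (patch extraction and the final hashing/histogram step are omitted; the function name pca_filter_bank is ours, not the reference implementation of [5]):

```python
import numpy as np

def pca_filter_bank(patches, n_filters):
    """Learn one PCANet-style linear filter bank.

    patches: (n_patches, patch_dim) array of vectorized image patches.
    Returns the top n_filters principal directions of the mean-removed
    patches; each row can be reshaped into a convolutional filter.
    """
    X = patches - patches.mean(axis=1, keepdims=True)  # remove per-patch mean
    cov = X.T @ X / X.shape[0]                         # patch covariance matrix
    _, eigvecs = np.linalg.eigh(cov)                   # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :n_filters].T           # leading directions as filters
```

Because every such stage is linear, stacked PCA filters compose into a single linear map until the final hashing layer; the sparsity-induced non-linearity in each SARM layer is precisely what lifts this limitation.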
Similar resources
A New Method for Detecting Ships in Low Size and Low Contrast Marine Images: Using Deep Stacked Extreme Learning Machines
Detecting ships in marine images is an essential problem in maritime surveillance systems. Although several types of deep neural networks have been used almost ubiquitously for this purpose, their performance drops greatly when they are exposed to low-size, low-contrast images captured by passive monitoring systems. On the other hand, factors such as sea waves, c...
Applying deep learning on electronic health records in Swedish to predict healthcare-associated infections
Detecting healthcare-associated infections poses a major challenge in healthcare. Using natural language processing and machine learning applied to electronic patient records is one approach that has been shown to work. However, the results indicate that there was room for improvement, and we have therefore applied deep learning methods. Specifically, we implemented a network of stacked sparse auto...
Learning deep representations via extreme learning machines
Extreme learning machine (ELM), as an emerging technology, has achieved exceptional performance in large-scale settings, and is well suited to binary and multi-class classification, as well as regression tasks. However, existing ELMs and their variants predominantly employ single-hidden-layer feedforward networks, leaving the popular and potentially powerful stacked generalization principle unexploi...
Building Energy Consumption Prediction: An Extreme Deep Learning Approach
Building energy consumption prediction plays an important role in improving the energy utilization rate by helping building managers make better decisions. However, as a result of randomness and noisy disturbances, accurate prediction of building energy consumption is not an easy task. In order to obtain better prediction accuracy, an extreme...
Marginalized Stacked Denoising Autoencoders
Stacked Denoising Autoencoders (SDAs) [4] have been used successfully in many learning scenarios and application domains. In short, denoising autoencoders (DAs) train one-layer neural networks to reconstruct input data from partial random corruption. The denoisers are then stacked into deep learning architectures where the weights are fine-tuned with back-propagation. Alternatively, the outputs...
Journal: CoRR
Volume: abs/1608.04062
Year of publication: 2016